Dimension reduction for Gaussian process emulation: an application to the influence of bathymetry on tsunami heights
High accuracy complex computer models, or simulators, require large resources
in time and memory to produce realistic results. Statistical emulators are
computationally cheap approximations of such simulators. They can be built to
replace simulators for various purposes, such as the propagation of
uncertainties from inputs to outputs or the calibration of some internal
parameters against observations. However, when the input space is of high
dimension, the construction of an emulator can become prohibitively expensive.
In this paper, we introduce a joint framework merging emulation with dimension
reduction in order to overcome this hurdle. The gradient-based kernel dimension
reduction technique is chosen due to its ability to drastically decrease
dimensionality with little loss in information. The Gaussian process emulation
technique is combined with this dimension reduction approach. Our proposed
approach provides an answer to the dimension reduction issue in emulation for a
wide range of simulation problems that cannot be tackled using existing
methods. The efficiency and accuracy of the proposed framework are demonstrated
theoretically, and compared with other methods on an elliptic partial
differential equation (PDE) problem. We finally present a realistic application
to tsunami modeling. The uncertainties in the bathymetry (seafloor elevation)
are modeled as high-dimensional realizations of a spatial process using a
geostatistical approach. Our dimension-reduced emulation enables us to compute
the impact of these uncertainties on resulting possible tsunami wave heights
near-shore and on-shore. We observe a significant increase in the spread of
uncertainties in the tsunami heights due to the contribution of the bathymetry
uncertainties. These results highlight the need to include the effect of
uncertainties in the bathymetry in tsunami early warnings and risk assessments.
Comment: 26 pages, 8 figures, 2 tables
Sequential Design with Mutual Information for Computer Experiments (MICE): Emulation of a Tsunami Model
Computer simulators can be computationally intensive to run over a large
number of input values, as required for optimization and various uncertainty
quantification tasks. The standard paradigm for the design and analysis of
computer experiments is to employ Gaussian random fields to model computer
simulators. Gaussian process models are trained on input-output data obtained
from simulation runs at various input values. Following this approach, we
propose a sequential design algorithm, MICE (Mutual Information for Computer
Experiments), that adaptively selects the input values at which to run the
computer simulator, in order to maximize the expected information gain (mutual
information) over the input space. The superior computational efficiency of the
MICE algorithm compared to other algorithms is demonstrated on test functions
and on a tsunami simulator, with overall gains of up to 20% in the latter case.
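The paradigm described in this abstract (train a Gaussian process on simulator runs, then choose the next run adaptively) can be illustrated with a minimal sketch. It is not the MICE algorithm: the selection rule below is the simpler greedy maximum-predictive-variance criterion, swapped in because the abstract does not give the mutual-information criterion in detail. The kernel hyperparameters, the nugget, the candidate grid and `toy_simulator` are all assumptions made for illustration.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(x_train, y_train, x_new, nugget=1e-8):
    """Posterior mean and variance of a zero-mean GP at the points x_new."""
    K = sq_exp_kernel(x_train, x_train) + nugget * np.eye(len(x_train))
    Ks = sq_exp_kernel(x_train, x_new)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, Ks)
    mean = Ks.T @ alpha
    var = sq_exp_kernel(x_new, x_new).diagonal() - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off

def sequential_design(simulator, candidates, x_init, n_add):
    """Greedily add the candidate with the largest predictive variance.

    This is a stand-in criterion (assumption), not the MICE criterion.
    """
    x = np.array(x_init, dtype=float)
    y = simulator(x)
    for _ in range(n_add):
        _, var = gp_predict(x, y, candidates)
        x_next = candidates[np.argmax(var)]
        x = np.append(x, x_next)
        y = np.append(y, simulator(np.array([x_next])))
    return x, y

def toy_simulator(x):
    """Hypothetical cheap stand-in for an expensive simulator."""
    return np.sin(2 * np.pi * x)

candidates = np.linspace(0.0, 1.0, 101)
x_design, y_design = sequential_design(toy_simulator, candidates, [0.0, 1.0], n_add=6)
mean, var = gp_predict(x_design, y_design, candidates)
max_abs_error = np.max(np.abs(mean - toy_simulator(candidates)))
```

In this sketch the design starts from the two end points and fills the largest gaps first; a mutual-information criterion such as the one proposed in the paper would score candidates differently.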
Efficient spatial modelling using the SPDE approach with bivariate splines
Gaussian fields (GFs) are frequently used in spatial statistics for their
versatility. The associated computational cost can be a bottleneck, especially
in realistic applications. It has been shown that computational efficiency can
be gained by doing the computations using Gaussian Markov random fields (GMRFs)
as the GFs can be seen as weak solutions to corresponding stochastic partial
differential equations (SPDEs) using piecewise linear finite elements. We
introduce a new class of representations of GFs with bivariate splines instead
of finite elements. This allows an easier implementation of piecewise
polynomial representations of various degrees. It leads to GMRFs that can be
inferred efficiently and can be easily extended to non-stationary fields. The
solutions approximated with higher order bivariate splines converge faster,
hence the computational cost can be alleviated. Numerical simulations using
both real and simulated data also demonstrate that our framework increases the
flexibility and efficiency.
Comment: 26 pages, 7 figures and 3 tables
Linked Gaussian Process Emulation for Systems of Computer Models Using Matérn Kernels and Adaptive Design
The state-of-the-art linked Gaussian process offers a way to build analytical
emulators for systems of computer models. We generalize the closed form
expressions for the linked Gaussian process under the squared exponential
kernel to a class of Matérn kernels, which are essential in advanced
applications. An iterative procedure to construct linked Gaussian processes as
surrogate models for any feed-forward system of computer models is presented
and illustrated on a feed-back coupled satellite system. We also introduce an
adaptive design algorithm that could increase the approximation accuracy of
linked Gaussian process surrogates with reduced computational costs on running
expensive computer systems, by allocating runs and refining emulators of
individual sub-models based on their heterogeneous functional complexity.
Deep Gaussian Process Emulation using Stochastic Imputation
We propose a novel deep Gaussian process (DGP) inference method for computer
model emulation using stochastic imputation. By stochastically imputing the
latent layers, the approach transforms the DGP into the linked GP, a
state-of-the-art surrogate model formed by linking a system of feed-forward
coupled GPs. This transformation yields a simple yet efficient DGP training
procedure that only involves optimizations of conventional stationary GPs. In
addition, the analytically tractable mean and variance of the linked GP allow
one to implement predictions from DGP emulators in a fast and accurate manner.
We demonstrate the method in a series of synthetic examples and real-world
applications, and show that it is a competitive candidate for efficient DGP
surrogate modeling in comparison to the variational inference and the
fully-Bayesian approach. A package implementing the method is also available at
https://github.com/mingdeyu/DGP
Multi-level emulation of tsunami simulations over Cilacap, South Java, Indonesia
Carrying out a Probabilistic Tsunami Hazard Assessment (PTHA) requires a large number of simulations done at a high resolution. Statistical emulation builds a surrogate to replace the simulator and thus reduces computational costs when propagating uncertainties from the earthquake sources to the tsunami inundations. To further reduce these costs, we propose here to build emulators that exploit multiple levels of resolution and a sequential design of computer experiments. By running a few tsunami simulations at high resolution and many more simulations at lower resolutions, we are able to provide realistic assessments whereas, for the same budget, using only the high resolution tsunami simulations does not provide a satisfactory outcome. As a result, PTHA can be considered with higher precision using the highest spatial resolutions, and for impacts over larger regions. We provide an illustration for the city of Cilacap in Indonesia that demonstrates the benefit of our approach.
Robust uncertainty quantification of the volume of tsunami ionospheric holes for the 2011 Tohoku-Oki earthquake: towards low-cost satellite-based tsunami warning systems
We develop a new method to analyze the total electron content (TEC) depression in the ionosphere after a tsunami occurrence. We employ Gaussian process regression to accurately estimate the TEC disturbance every 30 s using satellite observations from the global navigation satellite system (GNSS) network, even over regions without measurements. We face multiple challenges. First, the impact of the acoustic wave generated by a tsunami onto TEC levels is nonlinear and anisotropic. Second, observation points are moving. Third, the measured data are not uniformly distributed in the target range. Nevertheless, our method always computes the electron density depression volumes, along with estimated uncertainties, when applied to the 2011 Tohoku-Oki earthquake, even with random selections of only 5 % of the 1000 GPS Earth Observation Network System receivers considered here over Japan. Also, the statistically estimated TEC depression area mostly overlaps the range of the initial tsunami, which indicates that our method can potentially be used to estimate the initial tsunami. The method can warn of a tsunami event within 15 min of the earthquake, at high levels of confidence, even with a sparse receiver network. Hence, it is potentially applicable worldwide using the existing GNSS network.
Probabilistic Landslide Tsunami Estimation in the Makassar Strait, Indonesia, Using Statistical Emulation
This paper presents a significant advancement in the understanding of tsunamigenic landslide hazard across the length of the Makassar Strait in Indonesia. We use statistical emulation across the length of the continental slope to conduct a probabilistic assessment of tsunami hazard on a regional scale, across 14 virtual coastal gauges. Focusing on the potential maximum wave amplitudes (distance between the wave crest and the still‐water level) from possible tsunamigenic landslide events, we generate predictions from Gaussian Process emulators fitted to input‐outputs from 50 training scenarios. We show that the most probable maximum wave amplitudes in the majority of gauges are between 1 and 5 m, with the maximum predicted amplitudes reaching values of up to 10 m on the eastern coast, and up to 50 m on the western coast. We also explore the potential use of Gaussian multivariate copulas to sample emulator prediction input values to create a more realistic distribution of volumes along the continental slope. The novel use of statistical emulation across a whole slope enables the probabilistic assessment of tsunami hazard due to landslides on a regional scale. This area is of key interest to Indonesia since the new capital will be established in the East Kalimantan region on the western side of the Makassar Strait.
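The copula idea mentioned in this abstract can be sketched generically as follows. This is only an illustration of how a Gaussian copula produces dependent samples with chosen marginals, not the paper's implementation: the two-variable setting, the correlation `rho=0.8`, the uniform marginals on [1, 5] and every function name are hypothetical.

```python
import math
import random
from statistics import NormalDist

def gaussian_copula_samples(n, rho, inv_cdf_a, inv_cdf_b, seed=0):
    """Draw n pairs whose dependence follows a Gaussian copula with correlation rho.

    inv_cdf_a and inv_cdf_b are the inverse CDFs of the two marginals.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    samples = []
    for _ in range(n):
        # Correlated standard normals via a 2x2 Cholesky factor.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Map to uniforms, then to the requested marginals.
        u1, u2 = std_normal.cdf(z1), std_normal.cdf(z2)
        samples.append((inv_cdf_a(u1), inv_cdf_b(u2)))
    return samples

def uniform_inv_cdf(low, high):
    """Inverse CDF of a uniform distribution on [low, high]."""
    return lambda u: low + (high - low) * u

# Hypothetical example: two dependent input values at neighbouring locations.
pairs = gaussian_copula_samples(
    2000,
    rho=0.8,
    inv_cdf_a=uniform_inv_cdf(1.0, 5.0),
    inv_cdf_b=uniform_inv_cdf(1.0, 5.0),
)
```

Each pair could then be passed as inputs to a fitted emulator; the copula changes only how the inputs co-vary, not their individual distributions.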